1. PRIMARY CONTACT

Who should IP&S contact for further technical information about the invention and information 
about its planned use or public disclosure? 

Inventor Name: Geoffrey Burns 



2. PRESENT STAGE OF THE INVENTION 

□ Idea Research □ Development □ Manufacture 

3. GOVERNMENT CONTRACT INVENTION 

Was the invention made under a government contract? □ Yes ☒ No

4. PLEASE PROVIDE A TWO OR THREE SENTENCE SUMMARY OF 
YOUR INVENTION and include and underline KEY WORDS which 
might be useful in searching for relevant patents or publications: 

A field programmable processor array (FPPA) is described which can be embedded in a system on a chip.
Similar to a field programmable gate array macro, the FPPA can be included in a system on chip
to accommodate new circuit designs post-silicon, but with improved density and performance
relative to currently available embedded field programmable gate arrays.



5. PRESENT STATE OF THE ART 

Briefly describe the closest already-known technology that relates to the invention. This would 
include, for example, already existing products, methods or compositions which are known to you 
personally or through descriptions in publications. 

Embedded field programmable gate arrays are available from a limited number of vendors, and 
offer post-silicon reconfigurability in systems on a chip. These macros still offer poor design
density and unpredictable clock speed, particularly for high speed demodulation functions in digital 

(ADD LINES AS NECESSARY, IF COMPLETING ON COMPUTER, OR ATTACH ADDITIONAL PAGES)

6. ADVANCEMENT IN STATE OF THE ART 

Briefly describe the unique advancement achieved by the invention. This may be done, for 
example, by describing a problem with the prior art that is solved or specific objects that are 
achieved by the invention. 

The field programmable gate array is replaced with a programmable systolic array, which can 
more efficiently accommodate the signal processing in a digital radio. The systolic array is 
bounded by border cells which provide dataflow synchronization to the system, allowing simple
integration. A simple software programming flow is possible using this system, as opposed to the
proprietary hardware design flow characteristic of FPGAs.

(ADD LINES AS NECESSARY, IF COMPLETING ON COMPUTER, OR ATTACH ADDITIONAL PAGES)



7. WHAT IS THE BEST WAY YOU KNOW OF TO IMPLEMENT THE INVENTION? 

Briefly describe the invention and how it achieves the advancement described in paragraph 6.
(ADD LINES AS NECESSARY, IF COMPLETING ON COMPUTER, OR ATTACH ADDITIONAL PAGES)

***PLEASE NOTE: IF WE DECIDE TO FILE AN APPLICATION ON THIS INVENTION, THE
ATTORNEY WRITING THE APPLICATION WILL NEED THIS INFORMATION FROM YOU 
IN AS MUCH DETAIL AS POSSIBLE IN ORDER TO COMPLETE THE APPLICATION. 

8. DISCLOSURE OUTSIDE OF PHILIPS

If the invention has been or will be disclosed publicly or to anyone other than a Philips employee,
describe to whom (person / company), the date, and where.
Adelante Technologies (potentially)
Ibiquity Digital radio 



9. PUBLICATION 

Has a description of the invention been published or submitted for publication? □ Yes 
If "yes", please list each occurrence: 

Date Publication/Submission 



☒ No



10. PLEASE INDICATE THE PRODUCT OR SERVICE IN WHICH YOUR INVENTION MOST 
LIKELY WILL BE USED : 

ICs for broadcast channel decoding, wireless LAN physical layer, and cellular
1 Introduction

In [1] a programmable systolic array structure was proposed to provide an efficient, high- 
performance, and flexible architecture for signal processing functions in the front end of a 
programmable digital transceiver. The array can support programs implementing 
programmable digital filters, as well as other important signal processing kernels. Figure 
1 illustrates the essential elements of the architecture. 




Figure 1: Reconfigurable Processor Array architecture. A programmable systolic array 
for programmable digital radio front ends. 

The array is currently programmed using assembly language entry, with the assistance of
a graphical user interface programming tool. The tool generates an image of the array
program, mapped to each cell. The program image would reside in an on-chip memory,
then would be downloaded to the array over a shared bus in a random-access fashion [1].
It is straightforward to extend this methodology to include design entry from a signal
flow graph editor and simulation system such as the commercial offerings Simulink,
SPW, or COSSAP. This methodology is illustrated in Figure 2.
Figure 2: Array programming flow.



2 Modularization of processor array within embedded FPGA 
footprint 

In scenarios where we would like to insert a programmable signal chain into an existing
chain to support unanticipated changes, an embedded FPGA is often touted as the
solution. However, poor macro density and clock speed still render this scheme
expensive. An alternative is to embed the processor array in a similar embedded
FPGA footprint. As illustrated in Figure 3, a simple routing scheme can be designed to
allow insertion or replacement of functions in a signal chain with those realized on the
processor array.
In order to allow the embedded macro to be connected to the system in a flexible way, the
underlying array is terminated by border control cells that exchange data with the array
according to its nearest-neighbor interconnection scheme. The border control cells are
connected to external I/O circuits in a reconfigurable manner. This reconfigurable
interconnection can be realized using a crossbar network, or else via a local selection
mechanism in each border cell. The external I/O circuits then provide the connection
points to the external circuitry.
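
The crossbar option can be pictured with a small behavioral model. The sketch below is
illustrative only: the class name `Crossbar`, its methods, and the one-pad-per-border-cell
restriction are assumptions made for the example and do not describe the actual circuit.

```python
class Crossbar:
    """Hypothetical behavioral model of configurable pad-to-border-cell routing."""

    def __init__(self, num_pads, num_border_cells):
        self.num_pads = num_pads
        self.num_border_cells = num_border_cells
        self.route = {}  # border cell index -> I/O pad index

    def connect(self, border_cell, pad):
        """Configure one connection; each pad may drive at most one border cell."""
        if not 0 <= border_cell < self.num_border_cells:
            raise ValueError("no such border cell")
        if not 0 <= pad < self.num_pads:
            raise ValueError("no such I/O pad")
        if any(p == pad and c != border_cell for c, p in self.route.items()):
            raise ValueError("pad already routed to another border cell")
        self.route[border_cell] = pad

    def deliver(self, pad_values):
        """Return the value seen by each connected border cell for one sample."""
        return {cell: pad_values[pad] for cell, pad in self.route.items()}
```

A local selection mechanism in each border cell would correspond to storing one entry of
`route` inside each cell rather than in a central table.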




Figure 4: Some features of the interface structure that would provide an embedded macro
footprint. The interface scheme entails one I/O pad per unterminated array cell, one
border control cell per unterminated array cell, and a configurable multiplexing or
crossbar scheme to interconnect the I/O pads with the border cells.

The synchronization scheme between the external system and internal array is an 
important issue. External synchronization is often dataflow driven, which means 
functions are initiated upon the arrival of one or more valid data samples, often signaled 
using data valid signals. The internal synchronization scheme is systolic, meaning each 
processor runs in lock step. It would be preferable for the embedded macro to conform to a
dataflow synchronization mechanism. A more general scheme is the ring buffer
scheme often used in signal processing simulation systems such as Ptolemy, SPW, 
Simulink, or COSSAP. In this scheme, each active input is parameterized to indicate the 
quantity of valid samples that must be accumulated before the function is triggered. The 
output is similarly parameterized, except the parameter indicates the quantity of samples 
that are generated per function call. This synchronization scheme is outlined in Figure 5 
below. 




Figure 5: Characteristics of dataflow synchronization model, to be provided by the array 
boundary. 
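
The firing rule of this dataflow model can be sketched in a few lines. This is a minimal
behavioral sketch, not the proposed hardware: the names `InputPort`, `try_fire`, and
`outputs_per_call` are hypothetical and chosen only for the example.

```python
from collections import deque


class InputPort:
    """One active input: ready only after `quantity` valid samples have arrived."""

    def __init__(self, quantity):
        self.quantity = quantity  # samples required per function call
        self.samples = deque()

    def push(self, sample):
        # Called once per external data-valid strobe.
        self.samples.append(sample)

    def ready(self):
        return len(self.samples) >= self.quantity

    def take(self):
        # Remove exactly `quantity` samples, oldest first.
        return [self.samples.popleft() for _ in range(self.quantity)]


def try_fire(inputs, function, outputs_per_call):
    """Run `function` once if every input holds enough samples, else return None."""
    if not all(port.ready() for port in inputs):
        return None
    result = function(*[port.take() for port in inputs])
    if len(result) != outputs_per_call:
        raise ValueError("function must produce the declared number of output samples")
    return result
```

Each call to `try_fire` consumes the programmed quantity of samples from every input and
yields the programmed quantity of output samples, which mirrors the input and output
parameters described above.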



To realize the dataflow synchronization scheme, a border cell design is proposed that 
derives from the internal array cell. For exchange of data with the internal array, the 
border cell must maintain the manual synchronization method used between cells in the 
array. To simultaneously support ring buffer synchronization, the border cell must also 
accumulate and count incoming valid samples, and signal when the required quantity has 
arrived. Furthermore, the border cell must externally transmit all accumulated samples 
during array program execution. Using the internal array cell as a starting point, the 
desired behavior can be achieved by modifying the processor internal register file, such 
that data can be exchanged with the environment. According to Figure 6 below, an I/O
register buffer is appended to the processor design to achieve this behavior. For exchange 
of data with the array, the border cell can be programmed in the same manner, and with 
the same array tools, as used in the internal array. For exchange of data with the external 
environment, a configurable state machine within the I/O register buffer is employed. 




Figure 6: Border cell, which synchronizes the array interface with the external system 
using a dataflow mechanism. The cell is a modification of the internal cell, in order to 
maintain a uniform programming model with the programmable array. 

Figure 7 illustrates the concepts underlying the I/O register buffer, and Figure 8 illustrates
the buffer's interaction with the processor. A register space is divided into two partitions.
One partition is for registers mapped to the ring buffer input, while the other maps to the
ring buffer output. Furthermore, each register file partition is swapped between control
of the I/O register buffer state machine and control of the processor.
Under processor control, the register file is available to exchange data with the array 
using the normal nearest neighbor communication mechanism. Otherwise the register 
subspace is accumulating input samples, or discharging output samples under control of 
the write and read controls. The read control, in particular, must count the incoming 
samples, store them in the proper registers, signal when the programmed quantity of 
inputs have arrived, then swap the register mappings between processor and I/O control. 
The output control must empty the output registers and toggle the data valid signal for the 
external process. 
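
The input side of this swapping behavior resembles a double (ping-pong) buffer. The sketch
below models only that side, under assumptions: the class `InputRegisterBuffer`, its method
names, and the overrun check are hypothetical and are not taken from the cell design.

```python
class InputRegisterBuffer:
    """Hypothetical model of the input partition of the I/O register buffer."""

    def __init__(self, quantity):
        self.quantity = quantity            # programmed number of input samples
        self.io_bank = [None] * quantity    # filled under I/O control
        self.cpu_bank = [None] * quantity   # visible to the border processor
        self.count = 0
        self.valid = False                  # raised when a full block is ready

    def accept_sample(self, sample):
        """Store one incoming valid sample; swap banks when the block is full."""
        if self.count == self.quantity - 1 and self.valid:
            raise RuntimeError("overrun: previous block not yet consumed")
        self.io_bank[self.count] = sample
        self.count += 1
        if self.count == self.quantity:
            # Swap the register mappings between processor and I/O control.
            self.io_bank, self.cpu_bank = self.cpu_bank, self.io_bank
            self.count = 0
            self.valid = True

    def consume(self):
        """Processor side: read the completed block and clear the valid flag."""
        self.valid = False
        return list(self.cpu_bank)
```

While the processor works on one bank, the other bank keeps accumulating samples, so the
external stream does not have to stall during array program execution.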
Figure 7: Detail of I/O register buffer, which provides the synchronization between the
internal manually-synchronized array and the external dataflow synchronization
environment.




Figure 8: Detail of border processor core, which supports the dataflow synchronization 
using the internal cell programming model 



Another essential element to control the array is a master control cell, which is illustrated 
in Figure 9. The master cell drives the control bus connected to each cell in the array. 
This cell then has two roles. During configuration, the cell must pass the array program 
data from the system controller to the array program bus. During operation, the master 
cell triggers the array functionality by transmitting an execute command over the control
bus once all active input border cells signal a valid buffer. 
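
The two roles of the master cell can be summarized in a short behavioral sketch. The names
`MasterCell`, `configure`, `signal_valid`, and `bus_log` are hypothetical, and the bus is
modeled as a simple list of messages rather than as a real control or program bus.

```python
class MasterCell:
    """Hypothetical model of the master cell's configuration and execute roles."""

    def __init__(self, active_inputs):
        self.active_inputs = set(active_inputs)  # active input border cells
        self.valid = set()                       # cells that have signaled valid
        self.bus_log = []                        # stand-in for bus traffic

    def configure(self, program_words):
        """Configuration: pass (cell, word) pairs from the controller to the array."""
        for cell, word in program_words:
            self.bus_log.append(("program", cell, word))

    def signal_valid(self, border_cell):
        """Operation: record a valid buffer; issue execute once all have arrived."""
        if border_cell in self.active_inputs:
            self.valid.add(border_cell)
        if self.active_inputs and self.valid == self.active_inputs:
            self.bus_log.append(("execute",))
            self.valid.clear()
            return True
        return False
```

In this model the execute command is issued exactly once per complete set of valid
signals, after which the master cell waits for the next set.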




Figure 9: Master cell, which drives the array control bus according to incoming 
configuration information (during configuration), and ring buffer valid signals (during 
steady state). 